Fail-Stutter Fault Tolerance
نویسندگان
چکیده
Traditional fault models present system designers with two extremes: the Byzantine fault model, which is general and therefore difficult to apply, and the fail-stop fault model, which is easier to employ but does not accurately capture modern device behavior. To address this gap, we introduce the concept of fail-stutter fault tolerance, a realistic and yet tractable fault model that accounts for both absolute failure and a new range of performance failures common in modern components. Systems built under the fail-stutter model will likely perform well, be highly reliable and available, and be easier to manage when deployed.
منابع مشابه
Fault Tolerance
Permanent faults : This type of failure is persistent: it continues to exist until the faulty component is repaired or replaced. Examples of this fault are disk head crashes, software bugs, and burnt-out power supplies. Any of these faults may be either a fail-silent failure (also known as a fail-stop) or a Byzantine failure. A fail-silent fault is one where the faulty unit stops functioning an...
متن کاملAutomated design of efficient fail-safe fault tolerance
Both the scale and the reach of computer systems and embedded devices have been constantly increasing over the last decade. As such computer systems become pervasive, our reliance on such systems increases, resulting in our expectation for such systems to continuously deliver services, even in the presence of faults, that is we expect the computer systems to be dependable. One way to ensure the...
متن کاملA Fail-Safe CMOS Logic Gate
This paper reports a design technique to make Complex CMOS Gates fail-safe for a class of faults. Two classes of faults are denned. The failsafe design presented has limited fault-tolerance capability. Multiple faults are also covered.
متن کاملA Conceptual Framework for System Fault Tolerance
A major problem in transitioning fault tolerance practices to the practitioner community is a lack of a common view of what fault tolerance is, and how it can help in the design of reliable computer systems. This document takes a step towards making fault tolerance more understandable by proposing a conceptual framework. The framework provides a consistent vocabulary for fault tolerance concept...
متن کاملTiered Fault Tolerance for Long-Term Integrity
Fault-tolerant services typically make assumptions about the type and maximum number of faults that they can tolerate while providing their correctness guarantees; when such a fault threshold is violated, correctness is lost. We revisit the notion of fault thresholds in the context of long-term archival storage. We observe that fault thresholds are inevitably violated in longterm services, maki...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001